80 research outputs found
Detecting time-fragmented cache attacks against AES using Performance Monitoring Counters
Cache timing attacks use shared caches in multi-core processors as side channels to extract information from victim processes.
These attacks are particularly dangerous in cloud infrastructures, in which the deployed countermeasures cause collateral effects in terms of performance loss and increased energy consumption. We propose to monitor the victim process using an independent monitoring (detector) process that continuously measures selected Performance Monitoring Counters (PMCs) to detect the presence of an attack. Ad-hoc countermeasures can then be applied only when such a risky situation arises. In our case, the victim process is the Advanced Encryption Standard (AES) encryption algorithm and the attack is performed by means of random encryption requests. We demonstrate that PMCs are a feasible tool to detect the attack and that sampling PMCs at high frequencies is worse than sampling at lower frequencies in terms of detection capabilities, particularly when the attack is fragmented in time in an attempt to hide it from detection. Instituto de Investigación en Informátic
Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors
Asymmetric multicore processors (AMPs) have recently emerged as an appealing
technology for severely energy-constrained environments, especially in mobile
appliances where heterogeneity in applications is mainstream. In addition,
given the growing interest in low-power high-performance computing, this type
of architecture is also being investigated as a means to improve the
throughput-per-Watt of complex scientific applications.
In this paper, we design and embed several architecture-aware optimizations
into a multi-threaded general matrix multiplication (gemm), a key operation of
the BLAS, in order to obtain a high performance implementation for ARM
big.LITTLE AMPs. Our solution is based on the reference implementation of gemm
in the BLIS library, and integrates a cache-aware configuration as well as
asymmetric static and dynamic scheduling strategies that carefully tune and
distribute the operation's micro-kernels among the big and LITTLE cores of the
target processor. The experimental results on a Samsung Exynos 5422, a
system-on-chip with ARM Cortex-A15 and Cortex-A7 clusters that implements the
big.LITTLE model, show that our cache-aware versions of gemm with asymmetric
scheduling attain important gains in performance with respect to their
architecture-oblivious counterparts while exploiting all the resources of the
AMP to deliver considerable energy efficiency.
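
The idea of an asymmetric static schedule can be illustrated with a minimal sketch. It is not the BLIS implementation described in the abstract: the function name `split_iterations` and the `ratio` parameter (an assumed relative throughput of one big core versus one LITTLE core) are hypothetical, and the values used below are illustrative rather than measured.

```python
def split_iterations(n_iters, big_cores, little_cores, ratio):
    """Statically split loop iterations between a big and a LITTLE cluster.

    `ratio` is an ASSUMED relative throughput of one big core compared
    with one LITTLE core; it is a tuning parameter, not a measured value.
    Returns (iterations for the big cluster, iterations for the LITTLE cluster).
    """
    if n_iters < 0 or big_cores < 0 or little_cores < 0 or ratio <= 0:
        raise ValueError("invalid partition parameters")
    big_weight = big_cores * ratio        # aggregate weight of the big cluster
    total_weight = big_weight + little_cores
    if total_weight == 0:
        raise ValueError("at least one core is required")
    big_share = round(n_iters * big_weight / total_weight)
    return big_share, n_iters - big_share


if __name__ == "__main__":
    # Hypothetical example: 4 big + 4 LITTLE cores, big assumed 3x faster.
    print(split_iterations(1000, 4, 4, 3.0))
```

A symmetric schedule corresponds to `ratio = 1.0`, which gives every core the same share regardless of its type; a dynamic schedule would instead let cores claim chunks of work at run time.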
Algorithm 1033: Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Distributed-memory Architectures
This is the accepted version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Mathematical Software, Volume 49, Issue 1, https://doi.org/10.1145/3573383
The minimum distance of a linear code is a key concept in information theory. Therefore, the time required to compute it is very important to many problems in this area. In this article, we introduce a family of implementations of the Brouwer–Zimmermann algorithm for distributed-memory architectures for computing the minimum distance of a random linear code over 𝔽₂. Current commercial and public-domain software works only on either unicore or shared-memory architectures, which are limited in the number of cores/processors employed in the computation. Our implementations focus on distributed-memory architectures, and are thus able to employ hundreds or even thousands of cores in the computation of the minimum distance. Our experimental results show that our implementations are much faster, by up to several orders of magnitude, than the implementations in wide use today.
The authors would like to thank the University of Alicante for granting access to the ua cluster. They also want to thank Javier Navarrete for his assistance and support when working on this machine. The authors would also like to thank Robert A. van de Geijn from the University of Texas at Austin for granting access to the skx cluster.
Quintana-Ortí was supported by the Spanish Ministry of Science, Innovation and Universities under Grant RTI2018-098156-B-C54 co-financed by FEDER funds.
Hernando was supported by the Spanish Ministry of Science, Innovation and Universities under Grants PGC2018-096446-B-C21 and PGC2018-096446-B-C22, and by University Jaume I under Grant PB1-1B2018-10.
Igual was supported by Grants PID2021-126576NB-I00 and RTI2018-B-I00, funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”, and the Spanish CM (S2018/TCS-4423). This work has been supported by the Madrid Government (Comunidad de Madrid, Spain) under the Multiannual Agreement with Complutense University in the line Program to Stimulate Research for Young Doctors in the context of the V PRICIT (Regional Programme of Research and Technological Innovation) under project PR65-19/22445.
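
To make the quantity being computed concrete, the following is a minimal sketch that finds the minimum distance of a small binary linear code by exhaustive enumeration. It is deliberately not the Brouwer–Zimmermann algorithm and is not parallel; its cost grows as 2^k with the code dimension k, which is why faster algorithms and distributed implementations matter. The function name `min_distance` and the example generator matrices are illustrative assumptions.

```python
from itertools import product


def min_distance(G):
    """Minimum Hamming weight over all nonzero codewords of a binary code.

    G is a generator matrix given as a list of rows of 0/1 integers.
    The rows are ASSUMED to be linearly independent; otherwise some
    nonzero message maps to the zero codeword and the result is 0.
    """
    k, n = len(G), len(G[0])
    best = n
    for msg in product((0, 1), repeat=k):
        if not any(msg):
            continue  # skip the zero message (zero codeword)
        codeword = [0] * n
        for bit, row in zip(msg, G):
            if bit:
                # Addition over the binary field is bitwise XOR.
                codeword = [a ^ b for a, b in zip(codeword, row)]
        best = min(best, sum(codeword))
    return best


if __name__ == "__main__":
    # Systematic generator matrix of a [7,4] Hamming code.
    hamming = [
        [1, 0, 0, 0, 1, 1, 0],
        [0, 1, 0, 0, 1, 0, 1],
        [0, 0, 1, 0, 0, 1, 1],
        [0, 0, 0, 1, 1, 1, 1],
    ]
    print(min_distance(hamming))
```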
- …